Libris Britannia 4

Text File | 1987-10-07 | 20KB | 181 lines

\arttl\Putting ROM Code in Its Place \dek\A DOS locate utility {PULLQUOTES} {MS/pg. 2} To support relocation, the .EXE file is partitioned into two components\sc92\a header containing the relocation information and the actual binary load module. {MS/pg. 4} For embedded systems, a special type of loader called a locator is required. \artby\Rick Naro \adline\Rick Naro, Recycled Software, 45 Pleasant St., Milford, MA 01757. Rick is a field sales engineer for NEC Electronics. He can be reached via MCI Mail, CompuServe (73047,3031), or BIX (naro). \drop3\U\mc\ntil recently, developers programming for EPROM-based applications didn't have a utility that would take the standard output of MS-DOS tools and spit out a file suitable for burning into EPROM. After spending the last few months longing to use the latest in compilers, such as Microsoft's Quick C or Borland's Turbo C, I decided that perhaps the best tack to take would be to develop my own locate utility. Although it's an easy matter to put the contents of a .COM file into an EPROM, programs in .EXE format are a different animal altogether. Unlike the binary image of a program found in a .COM format, a .EXE file is a relocatable object module, which requires that the segment references in a program be relocated or adjusted. Before the .EXE file can be run, a loader must convert the relocatable format into an executable, absolute object module. In the typical MS-DOS system, program loading and segment fix-up is performed by COMMAND.COM and as such is transparent to the user. The COMMAND.COM loader simply reads the relocatable object module into memory, performs the segment fix-up by adjusting the segment references relative to the base segment of the load module, then transfers control to the program. To support relocation, the .EXE file is partitioned into two components\sc92\a header containing the relocation information and the actual binary load module. 
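The loader's job described above can be sketched in a few lines of C. This is a minimal sketch, not the COMMAND.COM code: the function name `exe_apply_fixups` and the idea of patching a whole .EXE file held in one buffer are assumptions made for illustration. The header offsets used are those of the standard MS-DOS .EXE header (signature `MZ`, relocation count at 06h, header size in paragraphs at 08h, relocation table offset at 18h), and each relocation entry is an offset/segment pair locating a word in the load module.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the .EXE header fields this sketch needs. */
enum {
    EXE_SIGNATURE  = 0x00,  /* 'M','Z' (5A4Dh as a little-endian word) */
    EXE_NUM_RELOCS = 0x06,  /* number of relocation entries */
    EXE_HDR_PARAS  = 0x08,  /* header size in 16-byte paragraphs */
    EXE_RELOC_OFF  = 0x18,  /* file offset of the relocation table */
    EXE_MIN_HEADER = 0x1C
};

/* Read and write little-endian 16-bit words in a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void wr16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/*
 * Hypothetical helper: apply every segment fix-up in a .EXE file held in
 * memory, as if the load module were placed at segment base_seg.
 * Returns the number of fix-ups applied, or -1 if the file is malformed.
 */
int exe_apply_fixups(uint8_t *exe, size_t len, uint16_t base_seg)
{
    uint16_t nreloc, i;
    size_t image, table;

    if (len < EXE_MIN_HEADER || rd16(exe + EXE_SIGNATURE) != 0x5A4D)
        return -1;

    nreloc = rd16(exe + EXE_NUM_RELOCS);
    image  = (size_t)rd16(exe + EXE_HDR_PARAS) * 16;  /* load module start */
    table  = rd16(exe + EXE_RELOC_OFF);

    if (image > len || table + (size_t)nreloc * 4 > len)
        return -1;

    for (i = 0; i < nreloc; i++) {
        const uint8_t *entry = exe + table + (size_t)i * 4;
        /* Each entry is offset:segment, relative to the load module. */
        size_t where = image + (size_t)rd16(entry + 2) * 16 + rd16(entry);

        if (where + 2 > len)
            return -1;
        /* Add the base segment to the virtual segment stored there. */
        wr16(exe + where, (uint16_t)(rd16(exe + where) + base_seg));
    }
    return nreloc;
}
```

With a base segment of 3000h and a stored virtual segment of 1234h, the patched word becomes 4234h. A real loader would copy only the load module into the allocated block and patch it there; patching the file image in place just keeps the sketch short.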
Both the memory requirements for the program and the initial register values are found in the header, so the first step of loading the binary image is easy. COMMAND.COM simply requests a suitably sized memory block in which to place the load module, then reads the binary image into that block of memory. Once the image is loaded, segment references are fixed relative to the base segment, using the segment fix-up records. When a programmer codes the following instruction sequence, for example, the assembler and linker cannot determine the final segment value to be used for the \ft2\data \ft1\segment, and the fix-up is left for the loader to perform:
\dn11\\text\mov ax, data ; Load ax with the data segment
\text\mov ds, ax ; Store in ds
\dn11\\text\Instead, the linker inserts the offset (or virtual segment) of the segment \ft2\data \ft1\from the base of the load module into the binary object module and inserts an entry in the segment fix-up record pointing to the segment reference requiring fix-up. If the program base is segment \ft2\3000h \ft1\and the virtual segment of \ft2\data \ft1\is \ft2\1234h\ft1\, the loader will perform the fix-up by adding the two and overwriting the segment offset with the sum \ft2\4234h \ft1\(which is the physical segment for the segment \ft2\data \ft1\in this instance). Once all segment fix-ups are processed, the loader can transfer control to the new program.
\dn11\\head2\An MS-DOS Locator\mc\For embedded systems, a special type of loader called a locator is required. A loader is distinguished from a locator in two ways: by output destination and by the organization of the absolute object module. Whereas a loader is designed to write the absolute object module directly to memory (for immediate execution), the output of a locator is an Intel extended hex file suitable for EPROM burning.
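For readers unfamiliar with the Intel extended hex format, the following C sketch formats a single record. It is an illustration rather than the code from LOCATE: the function name `ihex_record` and the enum names are assumptions. The record layout itself is the standard one: a colon, a byte count, a 16-bit address, a record type, the data bytes, and a checksum that is the two's complement of the sum of all preceding bytes. Record type 02 carries a segment base for the data records that follow, and type 03 carries a CS:IP start address.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Intel hex record types used with 8086 segmented addresses. */
enum {
    IHEX_DATA      = 0,  /* data bytes at the given offset */
    IHEX_EOF       = 1,  /* end of file */
    IHEX_EXT_SEG   = 2,  /* extended segment address (2 data bytes) */
    IHEX_START_SEG = 3   /* start address CS:IP (4 data bytes) */
};

/*
 * Hypothetical helper: format one record into out as a NUL-terminated
 * string without a trailing newline. Returns the string length, or -1
 * if the buffer is too small.
 */
int ihex_record(char *out, size_t outsz, uint8_t type, uint16_t addr,
                const uint8_t *data, uint8_t n)
{
    /* ':' + count + address + type + data + checksum + NUL */
    size_t need = 1 + 2 + 4 + 2 + 2 * (size_t)n + 2 + 1;
    uint8_t sum;
    int pos, i;

    if (outsz < need)
        return -1;

    sum = (uint8_t)(n + (addr >> 8) + (addr & 0xFF) + type);
    pos = sprintf(out, ":%02X%04X%02X",
                  (unsigned)n, (unsigned)addr, (unsigned)type);

    for (i = 0; i < n; i++) {
        pos += sprintf(out + pos, "%02X", (unsigned)data[i]);
        sum = (uint8_t)(sum + data[i]);
    }

    /* Checksum: two's complement of the low byte of the running sum. */
    pos += sprintf(out + pos, "%02X", (unsigned)(uint8_t)(0u - sum));
    return pos;
}
```

For example, an extended segment address record selecting segment fc00h is built from the two data bytes FCh, 00h (high byte first) and comes out as `:02000002FC0000`; the end-of-file record is `:00000001FF`.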
Another important feature of a locator is its ability to rearrange segments at arbitrary addresses to reflect the physical organization of the target system. A typical embedded system normally contains EPROM at the upper addresses for the program code and RAM at the lower addresses for data, interrupt vectors, and the stack. Because this organization is incompatible with the contiguous MS-DOS absolute object module, relocating the segments to new addresses is crucial to the operation of the locator. When the locator has finished processing a .EXE file, ROMable code and data will be fixed at addresses in the EPROM address space, volatile data and the stack will be fixed in the RAM address space, and the segment fix-ups will be adjusted to reflect the rearrangement of segments.
By itself, the .EXE file header contains insufficient information for relocation, so the segment map of the program, along with instructions on where the segments will be placed in the target system, is also required. The segment map is prepared by the linker and identifies each segment by name, class, length, and its position within the binary load module. The user must also prepare a configuration file describing the characteristics of the target system and the physical addresses to which the program segments will be bound. Using both the map and configuration files, the locator can extract and physically relocate the segments to build the ROMable load module.
Although a programmer is normally concerned with segments, they are far too numerous and varied in name to be of much use to the locator. Instead, the locator works with classes. A class is simply a tag applied to a segment by the assembler or compiler. This tag permits the linker to group a segment with other related segments. For example, each separately compiled source file in the large-memory model will generate a uniquely named code segment, but all such segments will belong to the \ft2\code \ft1\class.
Using the locator directives, a programmer can fix the address of any class and specify the order of a set of classes, configuring the absolute object code for any target hardware. The locator also needs to process the segment fix-ups but in a slightly different manner from the way the loader does. Each segment listed in the segment map is given an entry in a linked list, organized by class name, that records its segment name, length, virtual segment number, and physical segment number. When the configuration file is read, the base segment used for a class is fixed and all physical segment numbers are adjusted by adding this value to the segment offset within the class. Segment fix-up can then proceed: the virtual segment number from the fix-up record is used to scan the linked list for a match. If a match is found, the corresponding physical segment number is returned and used in the fix-up; otherwise, an unresolved segment error is reported.
Although the location process sounds simple, there are two pitfalls that must be avoided. As noted previously, an MS-DOS .EXE file is designed to be executed from a contiguous block of memory, whereas embedded systems typically have a fragmented address space with pockets of RAM and EPROM placed at the whim of the hardware designer. A potential problem exists in that two adjacent segments in different classes can share a common virtual segment and then be located noncontiguously when the segments are extracted. Because the virtual-to-physical-segment translation in this instance is ambiguous, a situation known as segment aliasing results. Segment aliasing can be avoided by guaranteeing that two segments in different classes never share a common virtual segment. This is easily accomplished by verifying that the first segment in a class is paragraph-aligned or that each segment spans a paragraph boundary.
There is also a potential problem in using groups.
A group is a collection of unrelated segments that are organized to fit within, and be addressed as, a single physical segment. Some linkers, such as the Microsoft linker, don't include sufficient information for the locator to reconstruct a group. If groups are used, the user must have information on the organization of the group and include instructions in the configuration file to permit its reconstruction. This is accomplished by using the locate directives to fix the address of the first class in the group and then order the remaining classes in the group as specified by the compiler vendor. \dn11\\head2\LOCATE\mc\LOCATE is an MS-DOS utility that accepts a relocatable .EXE file and outputs an absolute object module suitable for burning in EPROM. The source code and a make file for building LOCATE can be found in the accompanying listings (beginning on page xx). Because each application is unique, LOCATE uses several directives to control the location process. These directives are used to identify ROMable classes, assign physical segments to classes, and specify the order of classes in the absolute load module. Some directives accept a list of one or more operands. The \ft2\[ \ft1\and \ft2\] \ft1\characters are used whenever an operand is optional and can be repeated zero or more times. Unless otherwise specified, directives and operands are delimited by white space. The default configuration file has the file name of the input .EXE file with an extension of .CFG. Using a command-line option, the default file name can be overridden and any file can be specified to contain the configuration instructions. This option allows multiple load modules to share a common configuration file. \dn11\\head3\Class Directive\mc\The \ft2\class \ft1\directive assigns a physical segment to a class. The first segment in the specified class is assigned the base segment number and the remaining segments in the class are assigned segments relative to the first segment in the class. 
These segments depend on the length of the preceding segments and the segment alignment. The \ft2\class \ft1\directive uses the following syntax:
\dn11\\text\class class = seg
\dn11\\text\where \ft2\class \ft1\is the name of the class and \ft2\seg \ft1\is the 16-bit physical segment where the class will be located. For example:
\dn11\\text\class code = 0xfc00
\dn11\\text\assigns the class \ft2\code \ft1\to segment \ft2\fc00h \ft1\and therefore to the physical address \ft2\fc000h\ft1\.
\dn11\\head3\Order Directive\mc\The \ft2\order \ft1\directive is used to specify the ordering of two or more classes. It is important because it allows unrelated classes to be made contiguous without firsthand knowledge of the size and number of segments in the class. The \ft2\order \ft1\directive uses the following syntax:
\dn11\\text\order class [class]
\dn11\\text\where the first class in the list was specified in a \ft2\class \ft1\directive. Any class names specified after the first are located contiguously and aligned to the segment boundary of the first segment in each class. For example:
\dn11\\text\order code data bss
\dn11\\text\places the classes \ft2\data \ft1\and \ft2\bss \ft1\immediately after the class \ft2\code\ft1\.
\dn11\\head3\Dup Directive\mc\The \ft2\dup \ft1\directive is used to make a copy of the specified class. If used, the \ft2\dup \ft1\directive should appear before any other directives. The \ft2\dup \ft1\directive uses the following syntax:
\dn11\\text\dup class dup_class
\dn11\\text\where \ft2\class \ft1\is an existing class and \ft2\dup_class \ft1\is the name given to the copy of \ft2\class\ft1\. For example, the directive:
\dn11\\text\dup data const
\dn11\\text\makes a copy of the \ft2\data \ft1\class named \ft2\const\ft1\.
This command is used in conjunction with the \ft2\order \ft1\directive to locate the \ft2\data \ft1\and \ft2\bss \ft1\classes in RAM but force a copy of the class \ft2\data \ft1\to be included in EPROM for power-on initialization of any initialized data. If the class \ft2\data \ft1\contains the initialized data from a compiler, the following commands will locate \ft2\data \ft1\at address \ft2\1000h \ft1\and create a copy of \ft2\data \ft1\called \ft2\const \ft1\to be placed after the \ft2\code \ft1\class. The start-up code can then initialize the class \ft2\data \ft1\by copying the \ft2\const \ft1\class to the \ft2\data \ft1\class.
\dn11\\text\dup data const ; Copy the class and call it const
\text\.\sc128\.\sc128\.
\text\class data = 0x100 ; Fix data at address 01000h
\text\class code = 0xfc00 ; Fix code at address fc000h
\text\.\sc128\.\sc128\.
\text\order code const ; const immediately follows code and is read by the start-up code
\dn11\\head3\Rom Directive\mc\The \ft2\rom \ft1\directive is used to specify which classes are ROMable. Classes containing program code and constant data need to be located and placed in ROM to be available when the system is powered up and initialized. Other classes, such as uninitialized data and the stack, need only be located at a physical address and do not need to be placed in the output file. The \ft2\rom \ft1\directive uses the following syntax:
\dn11\\text\rom class [class]
\dn11\\text\For example:
\dn11\\text\rom code const
\dn11\\text\forces the classes \ft2\code \ft1\and \ft2\const \ft1\to be placed in the output object file.
\dn11\\head3\Comments\mc\To aid in documenting the location process, comments can be added to the tail of any command line or as a separate line. Comments begin with a semicolon (\ft2\;\ft1\) and continue to the end of the line. Blank lines and comments can appear freely within the configuration file for documentation and readability.
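The directive syntax described in the preceding sections is simple enough to parse one line at a time. The C sketch below is one way to do it under stated assumptions; it is not taken from the LOCATE listings, and the names `cfg_parse`, `struct cfg_line`, and the `D_*` constants are hypothetical. It also assumes that the `=` in a `class` directive is surrounded by white space, as in the examples above.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_OPERANDS 8

enum directive { D_NONE, D_CLASS, D_ORDER, D_DUP, D_ROM, D_ERROR };

struct cfg_line {
    enum directive kind;
    int nops;                       /* number of class-name operands */
    const char *ops[MAX_OPERANDS];  /* point into the caller's buffer */
    unsigned seg;                   /* class directive only */
};

/*
 * Hypothetical parser for one configuration line. The line is modified
 * in place. Blank and comment-only lines yield D_NONE.
 */
enum directive cfg_parse(char *line, struct cfg_line *out)
{
    static const char ws[] = " \t\r\n";
    char *tok, *semi;
    int has_seg = 0;

    memset(out, 0, sizeof *out);

    semi = strchr(line, ';');       /* a comment runs to end of line */
    if (semi != NULL)
        *semi = '\0';

    tok = strtok(line, ws);
    if (tok == NULL)
        return out->kind = D_NONE;

    if (strcmp(tok, "class") == 0)      out->kind = D_CLASS;
    else if (strcmp(tok, "order") == 0) out->kind = D_ORDER;
    else if (strcmp(tok, "dup") == 0)   out->kind = D_DUP;
    else if (strcmp(tok, "rom") == 0)   out->kind = D_ROM;
    else return out->kind = D_ERROR;

    while ((tok = strtok(NULL, ws)) != NULL) {
        if (out->kind == D_CLASS && strcmp(tok, "=") == 0) {
            unsigned long v;
            tok = strtok(NULL, ws);
            if (tok == NULL)
                return out->kind = D_ERROR;
            v = strtoul(tok, NULL, 0);  /* base 0 accepts 0xfc00 */
            if (v > 0xFFFFul)
                return out->kind = D_ERROR;
            out->seg = (unsigned)v;
            has_seg = 1;
            continue;
        }
        if (out->nops == MAX_OPERANDS)
            return out->kind = D_ERROR;
        out->ops[out->nops++] = tok;
    }

    /* Operand counts implied by the syntax of each directive. */
    if ((out->kind == D_CLASS && (out->nops != 1 || !has_seg)) ||
        (out->kind == D_DUP && out->nops != 2) ||
        ((out->kind == D_ORDER || out->kind == D_ROM) && out->nops < 1))
        return out->kind = D_ERROR;

    return out->kind;
}
```

Because the operand pointers refer to the caller's line buffer, a caller that keeps the results must copy the names before reading the next line. `strtok` is used for brevity; it is not reentrant.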
\dn11\\head3\Options\mc\To provide a degree of flexibility, the LOCATE utility accepts command-line options (or switches if you prefer) that influence the operation of the locator. Command-line options are lowercase letters introduced with a leading dash (\sc0\) with no white space between the option letter and the argument. Some examples of LOCATE command lines are:
\dn11\\text\locate \sc0\b hello
\text\locate \sc0\b \sc0\ccommon.cfg hello
\text\locate \sc0\hhello.hx \sc0\b hello
\dn11\A command line begins with \ft2\locate\ft1\, is followed by zero or more options, and is terminated with the path name of the file to be located. In the following descriptions, left and right brackets (\ft2\[ \ft1\and \ft2\]\sc128\\ft1\) denote mandatory arguments, unlike in the directive syntax, where they mark optional operands.
\dn11\\text\\ft2\\sc0\b\ft1\\sc92\By default, LOCATE generates an Intel extended hex start address record containing the entry point of the program. With the \ft2\\sc0\b \ft1\option, LOCATE instead creates an absolute segment at address \ft2\ffff:0 \ft1\and places there an intersegment jump instruction to the entry point of the program.
\text\\ft2\\sc0\c[filename]\ft1\\sc92\specifies a different configuration file. The default configuration file is \ft2\filename\ft1\.CFG, where \ft2\filename \ft1\is the name of the load module. One use of this option is to allow different object modules to be located using a shared configuration file.
\text\\ft2\\sc0\h[filename]\ft1\\sc92\changes the name of the Intel extended hex output file. Normally, the output is placed in a file with the same name as the .EXE input file and a default extension of .HEX.
\text\\ft2\\sc0\p[filename]\ft1\\sc92\changes the name of the locate map file containing the segment assignments and public symbols. Normally, the locate statistics are placed in a file with the same name as the .EXE input file and a default extension of .LOC.
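The address ffff:0 used by the b option is where an 8086-family processor begins execution after reset, which is why a jump placed there can serve as the power-on entry. The C sketch below shows how such an intersegment jump could be encoded; the function names are hypothetical, and whether LOCATE builds the instruction exactly this way is an assumption. The encoding itself is the standard 8086 direct intersegment JMP: opcode EAh followed by the 16-bit offset and then the 16-bit segment, both low byte first.

```c
#include <stdint.h>

/*
 * Hypothetical helper: encode "JMP FAR seg:off" as the five bytes that
 * would be placed at ffff:0 to reach the program's entry point.
 */
void encode_far_jmp(uint8_t out[5], uint16_t seg, uint16_t off)
{
    out[0] = 0xEA;                    /* direct intersegment jump */
    out[1] = (uint8_t)(off & 0xFF);   /* offset, low byte first */
    out[2] = (uint8_t)(off >> 8);
    out[3] = (uint8_t)(seg & 0xFF);   /* segment, low byte first */
    out[4] = (uint8_t)(seg >> 8);
}

/* 20-bit physical address of seg:off (segment times 16 plus offset). */
uint32_t phys_addr(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFul;
}
```

For an entry point at fc00:0010, the five bytes are EA 10 00 00 FC, and they sit at physical address ffff0h, leaving 16 bytes at the very top of the 1-Mbyte address space for the jump.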
\dn11\\head2\LOCATE Example\mc\To demonstrate the use of LOCATE, I'll now discuss an example that uses the Turbo C compiler from Borland. Using the large-memory model, the program in Example 1, page xx, loads several different code and data segments that can then be processed by LOCATE. The compiled C source and Turbo C run-time routines are linked with the Turbo C assembly-language start-up code, TC.ASM. The start-up code together with the \ft2\order \ft1\and \ft2\dup \ft1\directives demonstrates how the locator can initialize the \ft2\data \ft1\class and zero out the \ft2\bss \ft1\class. Take care to make sure that the standard library functions you use are compatible with non-MS-DOS systems.
As shown in Example 2, page xx, you begin by compiling the C source and the MASM start-up module. The Turbo C options used disable linking (\ft2\\sc0\c\ft1\) and select the large-memory model. I disable the automatic link following the compile so that I can substitute a ROMable version of the C start-up. The Turbo C large-memory-model library is searched to satisfy the external reference to the \ft2\strcpy(\sc128\) \ft1\function, as follows:
\dn11\\text\C>tlink /m tc demo, demo, demo, \sc168\turboc\sc168\lib\sc168\cl
\text\Turbo Link Version 1.0 Copyright (c) 1987 Borland International
\dn11\\text\For reference, the linker map file for this example is reproduced in Example 3, page xx. Note the segment and class assignments and watch how the locator processes and converts the executable image to a ROMable image. The configuration file for the example (see Example 4, page xx) must be able to handle the group and the initialized data generated by Turbo C. LOCATE is instructed to make a copy of the \ft2\data \ft1\class that contains the program's initialized data. Next, the base segments of the three independent classes (\ft2\code\ft1\, \ft2\data\ft1\, and \ft2\stack\ft1\) are specified using the \ft2\class \ft1\directive.
The \ft2\order \ft1\directive is used to recreate the Turbo C \ft2\dgroup \ft1\and fix the copy of the initialized data segment immediately following the \ft2\codeend \ft1\class. With the copy tucked nicely in EPROM and its physical address determined by the \ft2\tend \ft1\label, the start-up code can copy the class \ft2\const \ft1\to the \ft2\data \ft1\class before calling \ft2\main(\sc128\)\ft1\. LOCATE is then executed to process the .EXE file and output the absolute load module, as follows:
\dn11\\text\C>locate \sc0\b demo
\text\MS-DOS Locate Utility - Version 1.0
\text\Copyright (C) 1987 Rick Naro. All rights reserved
\dn11\The output from the locator, shown in Example 5, page xx, is an Intel extended hex file and a segment map detailing the new physical address assignments. The locate map also contains a list of public symbols for use in debugging the target system. Note how the segments and classes have been relocated according to the instructions in the configuration file and correspond to the addresses of the target hardware. The file DEMO.HEX (in Example 6, page xx) is now ready to be sent to the EPROM programmer. In an 8-bit system, the data is burned directly into one or more EPROMs. In a 16-bit bus system, the EPROM programmer must be used to split the load module into upper and lower bytes, which are programmed into separate EPROMs.
\dn11\\head2\Summary\mc\Although a simple example, the sample program demonstrates the power and flexibility of turning a low-cost PC into a powerful embedded-system development tool for the NEC and Intel microprocessors. With access to a wide range of popular software development tools, program development for embedded systems has never been easier.
\dn11\\head2\Availability\mc\All the source code for articles in this issue is available on a single disk. To order, send $14.95 to \ft2\Dr. Dobb's Journal\ft1\, 501 Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext. 216.
Please specify the issue number and format (MS-DOS, Macintosh, Kaypro). \ft3\DDJ (Listings begin on page xx.) \ft1\Vote for your favorite feature/article. Circle Reader Service \ft3\No. x.
\text\\ft4\Example 1: \ft2\The demonstration program
\ft4\Example 2: \ft2\Compiling the C source code and the MASM start-up module
\ft4\Example 3: \ft2\The linker map file
\ft4\Example 4: \ft2\The configuration file
\ft4\Example 5: \ft2\The output from the locator
\ft4\Example 6: